fix(pure): Make Pure FlashArray HTTP client timeout configurable #5551
rgolangh merged 2 commits into kubev2v:main
Conversation
```go
flag.StringVar(&vspherePassword, "vsphere-password", os.Getenv("GOVMOMI_PASSWORD"), "vSphere's API password")
flag.StringVar(&esxiCloneMethod, "esxi-clone-method", os.Getenv("ESXI_CLONE_METHOD"), "ESXi clone method: 'vib' (default) or 'ssh'")
flag.IntVar(&sshTimeoutSeconds, "ssh-timeout-seconds", 30, "SSH timeout in seconds for ESXi operations (default: 30)")
flag.IntVar(&storageAPITimeoutSeconds, "storage-api-timeout-seconds", 30, "HTTP client timeout in seconds for storage API requests (default: 30)")
```
We need to make the default `os.Getenv("STORAGE_HTTP_TIMEOUT_SECONDS")` instead of 30, and that will allow the configuration to be passed as part of the storage secret in the storageMap. Otherwise this is hard to use.
Also please add that entry in cmd/vsphere-xcopy-volume-populator/README.md under the STORAGE_ secret keys.
Both done in the latest commit 2a7932c:

- `storageAPITimeoutSeconds` is now a `StringVar` with `os.Getenv("STORAGE_HTTP_TIMEOUT_SECONDS")` as default, same pattern as the other `STORAGE_*` vars. `strconv.Atoi` handles the conversion at the call site with a warning log for bad values, and the `<= 0` guard in `NewRestClient` keeps the 30s fallback.
- Added `STORAGE_HTTP_TIMEOUT_SECONDS` to the secret keys table in the README.
the DCO check is failing - please add your git signature
force-pushed d36ce72 to 6b5eabf
Done
cmd/vsphere-xcopy-volume-populator/vsphere-xcopy-volume-populator.go (outdated; resolved)
Codecov Report ✅ All modified and coverable lines are covered by tests.

```
@@            Coverage Diff             @@
##             main    #5551       +/-  ##
==========================================
- Coverage   15.45%   10.10%    -5.35%
==========================================
  Files         112      500     +388
  Lines       23377    57429   +34052
==========================================
+ Hits         3613     5804    +2191
- Misses      19479    51144   +31665
- Partials      285      481     +196
```
```go
case forklift.StorageVendorProductPureFlashArray:
	apiTimeout, err := strconv.Atoi(storageAPITimeoutSeconds)
	if err != nil && storageAPITimeoutSeconds != "" {
		klog.Warningf("invalid value %q for storage-api-timeout-seconds, using default (30s): %v", storageAPITimeoutSeconds, err)
```
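The truncated hunk above (parse the string flag, warn on bad input, fall through to the default) can be sketched in full as a small helper. This is an illustration under assumptions: `parseTimeoutSeconds` is a hypothetical name, stdlib `log` stands in for `klog`, and returning a non-positive value is what lets `NewRestClient`'s `<= 0` guard apply the 30s default.

```go
package main

import (
	"fmt"
	"log"
	"strconv"
)

// parseTimeoutSeconds converts the string flag value to an int. An empty
// or invalid value yields 0, so the <= 0 guard downstream falls back to
// the 30s default. The warning names the flag, as the review asked.
// (Hypothetical helper; stdlib log stands in for klog here.)
func parseTimeoutSeconds(raw string) int {
	if raw == "" {
		return 0 // unset: let NewRestClient use its 30s default
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		log.Printf("invalid value %q for storage-api-timeout-seconds, using default (30s): %v", raw, err)
		return 0
	}
	return n
}

func main() {
	fmt.Println(parseTimeoutSeconds("45"))  // 45
	fmt.Println(parseTimeoutSeconds("abc")) // 0, with a warning logged
}
```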
In the warning, change it to the new flag name
We are close; there's another small comment, and please also pull/rebase.
…GE_HTTP_TIMEOUT_SECONDS Resolves: None Signed-off-by: Michael Jons <Michael.Jons@tre.se>
force-pushed 93f44aa to 9b43830
Done
Resolves: None Signed-off-by: Michael Jons <Michael.Jons@tre.se>
/backport release-2.11
✅ PR #5551 backported to |
…nfigurable (#5615)

**Backport:** #5551

**Make Pure FlashArray HTTP client timeout configurable**

**Problem:** During migrations of VMs with many disks, simultaneous `CopyVolume` requests to Pure FlashArray were timing out, leaving PVCs stuck in `Pending`. In one observed case, 15 disks were migrated but only 7 reached `Bound` status; the remaining 8 populator pods failed with:

```
failed to copy VMDK using VVol storage API: copy operation failed: Pure FlashArray CopyVolume failed: failed to send copy volume request: Post "https://<array>/api/2.46/volumes?overwrite=true": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
```

The root cause is that the HTTP client timeout was hardcoded to 30 seconds with no way to extend it, making it impossible to accommodate slower or heavily-loaded arrays.

**Changes:**
- `NewRestClient` now accepts an `httpTimeoutSeconds int` parameter instead of a hardcoded value. A value of `<= 0` falls back to the 30s default.
- `NewFlashArrayClonner` threads the parameter through to `NewRestClient`.
- A `--storage-api-timeout-seconds` CLI flag (default: `30`) is added to the `vsphere-xcopy-volume-populator` binary.

**How to configure:** Pass `--storage-api-timeout-seconds=<value>` to the populator binary. Full operator-side wiring (CRD field → `VSphereXcopyPluginConfig` → `VSphereXcopyVolumePopulatorSpec` → populator-controller pod args) is a follow-up.

**Default behaviour is unchanged**: the timeout remains 30 seconds unless explicitly overridden.

Signed-off-by: Michael Jons <Michael.Jons@tre.se>
Co-authored-by: Michael Jons <Michael.Jons@tre.se>
Make Pure FlashArray HTTP client timeout configurable

**Problem:**
During migrations of VMs with many disks, simultaneous `CopyVolume` requests to Pure FlashArray were timing out, leaving PVCs stuck in `Pending`. In one observed case, 15 disks were migrated but only 7 reached `Bound` status; the remaining 8 populator pods failed with:

```
failed to copy VMDK using VVol storage API: copy operation failed: Pure FlashArray CopyVolume failed: failed to send copy volume request: Post "https://<array>/api/2.46/volumes?overwrite=true": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
```

The root cause is that the HTTP client timeout was hardcoded to 30 seconds with no way to extend it, making it impossible to accommodate slower or heavily-loaded arrays.

**Changes:**
- `NewRestClient` now accepts an `httpTimeoutSeconds int` parameter instead of a hardcoded value. A value of `<= 0` falls back to the 30s default.
- `NewFlashArrayClonner` threads the parameter through to `NewRestClient`.
- A `--storage-api-timeout-seconds` CLI flag (default: `30`) is added to the `vsphere-xcopy-volume-populator` binary.

**How to configure:**
Pass `--storage-api-timeout-seconds=<value>` to the populator binary. Full operator-side wiring (CRD field → `VSphereXcopyPluginConfig` → `VSphereXcopyVolumePopulatorSpec` → populator-controller pod args) is a follow-up.

**Default behaviour is unchanged**: the timeout remains 30 seconds unless explicitly overridden.